Graphical Representation of Textual Data Using Text Categorization System
نویسندگان
چکیده
This paper presents the graphical representation of textual data using text categorization; we had concentrated on the compact representation of the document. Text Categorization has become an important task in data mining (text mining) because of the development of electronic commerce over the internet. All organizations that have business based on internet need an effective categorization method for managing large amount of textual data which is available in various forms like sales orders, summary documents, emails, journals and memos etc. Here we have used both globalized as well as localized feature selection methods. The localized method that we have introduced has also improved the accuracy of the classifier. The classifier that we have used is K-NN that is K nearest neighbor. The K-NN is simple and is having better precision in classifying a document. Also this K-NN does not need any training resources or model to be built up and it categorizes on the fly .Therefore its cost is also less as no resources need to be trained and accuracy is also better than any other classifier.
منابع مشابه
When Naïve is not Enough: Bringing Naïve Bayes Text Categorization to "Surface"
Since information has become more and more available in digital format, especially on the World Wide Web, organizing and classifying digital documents, making them accessible and presenting them in a proper way are becoming important issues. Digital Library Management Systems (DLMSs) are an example of systems that manage collections of multi-media digitalized data and include components that pe...
متن کاملOpinion Mining in Hungarian based on textual and graphical clues
Opinion Mining aims at recognizing and categorizing or extracting opinions found in unstructured text resources and is one of the most dynamically evolving subdiscipline of Computational Linguistics showing some resemblance to document classification and information extraction tasks. In this paper we propose a novel approach in Opinion Mining which combines Machine Learning models based on trad...
متن کاملTextual Data Representation
We address in this report the problem of representing formally textual data. First, this problem is replaced in the context of automatic text processing. Then, the weaknesses of the basic document representation, i.e. the bag-of-words representation, are explained and some state-ofthe-art methods claiming to overcome these weaknesses are reviewed. Moreover we propose a novel graphical model, th...
متن کاملThe Effect of Visual Representation, Textual Representation, and Glossing on Second Language Vocabulary Learning
In this study, the researcher chose three different vocabulary techniques (Visual Representation, Textual Enhancement, and Glossing) and compared them with traditional method of teaching vocabulary. 80 advanced EFL Learners were assigned as four intact groups (three experimental and one control group) through using a proficiency test and a vocabulary test as a pre-test. In the visual group, stu...
متن کاملTextual Modelling Embedded into Graphical Modelling
Today’s graphical modelling languages, despite using symbols and connections, represent large model parts as structured text. We benefit from sophistic text editors, when we use programming languages, but we neglect the same technology, when we edit the textual parts of graphical models. Recent advances in generative engineering of textual model editors allow to create such sophisticated text e...
متن کامل